Parallel Algorithms for Predictive Modelling

Author

  • MARKUS HEGLAND
Abstract

Parallel computing enables the analysis of very large data sets using large collections of flexible models with many variables. The computational methods are based on ideas from computational linear algebra and can draw on the extensive research on parallel algorithms in this area. Many algorithms for the direct and iterative solution of penalised least squares problems, and for updating, can be applied. Methods for both dense and sparse problems are applicable. An important property of the algorithms is their scalability, i.e., their ability to solve larger problems in the same time using hardware which grows linearly with the problem size. While in most cases large granularity parallelism is to be preferred, it turns out that even smaller granularity parallelism can be exploited effectively in the problems considered. The development is illustrated by four examples of nonparametric regression techniques. In the first example, additive models are considered. While the backfitting method contains dependencies which inhibit parallel execution, it turns out that parallelisation over the data leads to a viable method, akin to the bagging algorithm without replacement, which is known to have superior statistical properties in many cases. The second example considers radial basis function fitting with thin plate splines. Here the direct approach turns out to be non-scalable, but an approximation with finite elements is shown to be scalable and parallelises well. The third example discusses MARS (Multivariate Adaptive Regression Splines), one of the most popular algorithms in data mining. MARS has been modified to use a multiscale approach, and a parallel algorithm with small granularity has been seen to give good results. The final example considers the current research area of sparse grids. Sparse grids take up many ideas from the previous examples and, in fact, can be considered a generalisation of MARS and additive models. They are naturally parallel when the combination technique is used. We discuss limitations and improvements of the combination technique.
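
The idea of parallelising a penalised least squares fit over the data can be pictured with a small sketch. It is not taken from the paper: the function names, the simple ridge-type penalty `penalty * I`, the synthetic data, and the use of a thread pool as a stand-in for parallel workers are all assumptions made for illustration. Each worker reduces its own chunk of the data to a small matrix and vector, the partial results are summed, and a single small linear system is solved at the end, so the data are scanned only once.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_normal_equations(rows):
    """Accumulate X^T X and X^T y for one chunk of (features, target) rows."""
    p = len(rows[0][0])
    xtx = [[0.0] * p for _ in range(p)]
    xty = [0.0] * p
    for x, y in rows:
        for i in range(p):
            xty[i] += x[i] * y
            for j in range(p):
                xtx[i][j] += x[i] * x[j]
    return xtx, xty


def solve(a, b):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_penalised_ls(data, penalty, n_chunks=4):
    """Reduce each chunk independently, sum the partial results, then solve.

    The penalty term (penalty * identity) is an illustrative assumption.
    """
    chunks = [data[k::n_chunks] for k in range(n_chunks)]
    chunks = [c for c in chunks if c]
    # Threads stand in for parallel workers; the reduction pattern is the point.
    with ThreadPoolExecutor() as pool:
        parts = list(pool.map(partial_normal_equations, chunks))
    p = len(data[0][0])
    xtx = [[sum(part[0][i][j] for part in parts) for j in range(p)]
           for i in range(p)]
    xty = [sum(part[1][i] for part in parts) for i in range(p)]
    for i in range(p):
        xtx[i][i] += penalty
    return solve(xtx, xty)


# Hypothetical synthetic data: y = 2 + 3 t, with features (1, t).
data = [((1.0, float(t)), 2.0 + 3.0 * t) for t in range(20)]
coeffs = fit_penalised_ls(data, penalty=0.0)
coeffs_two_chunks = fit_penalised_ls(data, penalty=0.0, n_chunks=2)
```

Because the partial sums are simply added, the fitted coefficients do not depend on how the data are split across workers, and the size of the final system depends only on the number of model terms, not on the number of data points.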

Similar resources

3-RPS Parallel Manipulator Dynamical Modelling and Control Based on SMC and FL Methods

In this paper, a dynamical model-based SMC (Sliding Mode Control) is proposed for trajectory tracking of a 3-RPS (Revolute, Prismatic, Spherical) parallel manipulator. Ignoring the small inertial effects of all legs and joints compared with those of the end-effector of the 3-RPS, the dynamical model of the manipulator is developed based on the Lagrange method. By removing the unknown Lagrange multipliers...

Reliability Modelling of the Redundancy Allocation Problem in the Series-parallel Systems and Determining the System Optimal Parameters

Considering the increasingly high attention to quality, promoting the reliability of products during the design process has gained significant importance. In this study, we consider one of the current models of reliability science and propose a non-linear programming model for redundancy allocation in series-parallel systems according to the redundancy strategy and considering the assump...

Predictive Load Balancing on Parallel Networks

Networks of workstations are commonly used alternatives to dedicated parallel machines; however, one of their major drawbacks is the limited communication bandwidth. In this paper, we use parallel network based techniques to optimize the available bandwidth. In real life, the data arrival and service rates are stochastic (dynamic) processes. Predictive filtering is a commonly used tec...

Scalable parallel algorithms for surface fitting and data mining

This paper presents scalable parallel algorithms for high dimensional surface fitting and predictive modelling which are used in data mining applications. These algorithms are based on techniques like finite elements, thin plate splines, wavelets and additive models. They all consist of two steps: First, data is read from secondary storage and a linear system is assembled. Secondly, the linear ...

Scalable parallel algorithms for predictive modelling

Data Mining applications have to deal with increasingly large data sets and complexity. Only algorithms which scale linearly with data size are feasible. We present parallel regression algorithms which after a few initial scans of the data compute predictive models for data mining and do not require further access to the data. In addition, we describe various ways of dealing with the complexity...

Load Sharing Control of Parallel Inverters with Uncertainty in the Output Filter Impedances for Islanding Operation of AC Micro-Grid

Parallel connection of inverter modules is a solution to increase the reliability, efficiency and redundancy of inverters in a micro-grid system. Proper load sharing among parallel inverters is a key point. The circulating current among the inverters can greatly reduce the efficiency or even cause instability of the system. In this paper, a control strategy for improving the load sharing performance ...

Publication date: 2003